Phrase-based decoding is conceptually simple and straightforward to implement, at the cost of drastically oversimplified reordering models. Syntactically aware models make it possible to capture linguistically relevant relationships in order to improve word order, but they can be more complex to implement and optimise.

In this paper, we explore a new middle ground between phrase-based and syntactically informed statistical MT, in the form of a model that supplements conventional, non-hierarchical phrase-based techniques with linguistically informed reordering based on syntactic dependency trees. The key idea is to exploit linguistically informed hierarchical structures only for those dependencies that cannot be captured within a single flat phrase. For very local dependencies we leverage the success of conventional phrase-based approaches, which provide a sequence of target-language words appropriately ordered and ready-made with the appropriate agreement morphology.

Working with dependency trees rather than constituency trees allows us to take advantage of the flexibility of phrase-based systems to treat non-constituent fragments as phrases. We do impose a requirement --- that the fragment be a novel sort of "dependency constituent" --- on what can be translated as a phrase, but this is much weaker than the requirement that phrases be traditional linguistic constituents, which has often proven too restrictive in MT systems.
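To make the notion of a "dependency constituent" concrete, a natural reading is that a phrase must form a connected fragment of the dependency tree, i.e. a span in which every token's head lies inside the span except for exactly one token (the fragment's internal root). The paper does not specify a membership test, so the following is a minimal sketch under that assumption; the function name `is_dependency_fragment` and the head-array representation are illustrative choices, not the paper's.

```python
def is_dependency_fragment(heads, start, end):
    """Check whether tokens [start, end) form a connected fragment of
    the dependency tree.

    `heads[i]` is the index of token i's head, or -1 for the sentence
    root. A span is a connected fragment exactly when precisely one of
    its tokens has a head outside the span: all other tokens chain up
    to that single internal root, so the fragment hangs together.
    """
    external = [i for i in range(start, end)
                if not (start <= heads[i] < end)]
    return len(external) == 1

# Toy sentence "the old man slept":
#   the -> man, old -> man, man -> slept, slept -> ROOT
heads = [2, 2, 3, -1]
print(is_dependency_fragment(heads, 0, 3))  # "the old man": True
print(is_dependency_fragment(heads, 0, 2))  # "the old": False
```

Note that this condition is much weaker than requiring a traditional constituent: any connected piece of the tree qualifies, including fragments a constituency grammar would never bracket.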